In the last few years, we have seen the rise of deep learning applications in a broad range of chemistry research problems. Recently, we reported on the development of Chemception, a deep convolutional neural network (CNN) architecture for general-purpose small-molecule property prediction. In this work, we investigate the effects of systematically removing and adding basic chemical information to the image channels of the 2D images used to train Chemception. By augmenting the images with only three additional channels of basic chemical information, we demonstrate that Chemception now outperforms contemporary deep learning models trained on more sophisticated chemical representations (molecular fingerprints) for the prediction of toxicity, activity, and solvation free energy, as well as physics-based free energy simulation methods. Our work thus demonstrates that a firm grasp of first-principles chemical knowledge is not a prerequisite for deep learning models to accurately predict chemical properties. Lastly, by altering the chemical information content in the images and examining the resulting performance of Chemception, we also identify two different learning patterns in predicting toxicity/activity as compared to solvation free energy, and these patterns suggest that Chemception is learning about its tasks in a manner consistent with established knowledge.
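To make the channel-augmentation idea concrete, here is a minimal, hypothetical sketch of rasterizing a molecule onto a multi-channel 2D grid, where extra channels carry basic per-atom chemical information (here, atomic number and partial charge). The toy atom records, grid size, and channel choices are illustrative assumptions for exposition, not the actual Chemception preprocessing pipeline.

```python
import numpy as np

# Hypothetical toy molecule (water): each atom is (x, y, atomic_number, partial_charge).
# Coordinates and charge values are illustrative, not taken from the paper.
atoms = [
    (0.00, 0.00, 8, -0.4),   # O
    (0.96, 0.00, 1,  0.2),   # H
    (-0.24, 0.93, 1,  0.2),  # H
]

def atoms_to_image(atoms, size=16, resolution=0.5):
    """Rasterize atoms onto a (size, size, 2) grid centered on the image.

    Channel 0 encodes atomic number, channel 1 partial charge -- a
    simplified stand-in for augmented chemical-information channels.
    """
    img = np.zeros((size, size, 2), dtype=np.float32)
    cx = cy = size // 2
    for x, y, z, q in atoms:
        # Map Cartesian coordinates to the nearest pixel.
        i = int(round(cx + x / resolution))
        j = int(round(cy + y / resolution))
        if 0 <= i < size and 0 <= j < size:
            img[j, i, 0] = z  # atomic-number channel
            img[j, i, 1] = q  # partial-charge channel
    return img

img = atoms_to_image(atoms)
```

A grayscale variant of this image (atom positions only) would correspond to the chemically minimal input, while the extra channels add the basic chemical information whose effect the work investigates.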